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Optimal estimation with missing observations 
via balanced time-symmetric stochastic models 

Tryphon T. Georgiou, Fellow, IEEE and Anders Lindquist, Life Eellow, IEEE 


Abstract —We consider data fusion for the purpose of smooth¬ 
ing and interpolation based on observation records with missing 
data. Stochastic processes are generated by linear stochastic 
models. The paper begins by drawing a connection between 
time reversal in stochastic systems and all-pass extensions. A 
particular normalization (choice of basis) between the two time- 
directions allows the two to share the same orthononnalized 
state process and simplifies the mathematics of data fusion. 
In this framework we derive symmetric and balanced Mayne- 
Fraser-like formulas that apply simultaneously to smoothing and 
interpolation. 

I. Introduction 

Data fusion is the process of integrating different data sets, 
or statistics, into a more accurate representation for a quantity 
of interest. A case in point in the context of systems and 
control is provided by the Mayne-Fraser two-hlter formula 
m, El in which the estimates generated by two different 
biters are merged into a combined more reliable estimate 
in bxed-interval smoothing. The purpose of this paper is to 
develop such a two-blter formula that is universally applicable 
to smoothing and interpolation based on general records with 
missing observations. 

In El. Bl the Mayne-Fraser formula was analyzed in the 
context of stochastic realization theory and was shown that 
it can be formulated in terms a forward and a backward 
Kalman biter. In a subsequent series of papers, Pavon 0, 
||6l addressed in a similar manner the hitherto challenging 
problem of interpolation 0, 0, 13, ifTOl . This latter prob¬ 
lem consists of reconstructing missing values of a stochastic 
process over a given interval. In departure from the earlier 
statistical literature, a, 0 considered a stationary process 
with rational specbal density and, therefore, reliazable as the 
output of a linear stochastic system. Interpolation was then 
cast as seeking an estimate of the state process based on an 
incomplete observation record. A basic tool in these works 
is the concept of time-reversal in stochastic systems which 
has been central in stochastic realization theory (see, e.g., 
ifTTl . ca, Ea, 03, 0, 0, Ea, Ea, ini)- For a recent 
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overview of smoothing and interpolation theory in the context 
of stochastic realization theory see ESl Chapter 15]. 

In the present paper we are taking this program several 
steps further. Given intermittent observations of the output 
of a linear stochastic system over a bnite interval, we want 
to determine the linear least-squares estimate of the state 
of the system in an arbitrary point in the interior of the 
interval, which may either be in a subinterval of missing 
data or in one where observations are available. Hence, this 
combines smoothing and interpolation over general patterns of 
available observations. Our main interest is in continuous-time 
(possibly time-varying) systems. However, the absence of data 
over subintervals, depending on the information pattern, may 
necessitate a hybrid approach involving discrete-time bitering 
steps. 

In studying the statistics of a process over an interval, it is 
natural to decompose the interface between past and future in 
a time-symmetric manner. This gives rise to systems represen¬ 
tations of the process running in either time direction, forward 
or backward in time. This point was fundamental in early work 
in stochastic realization; see m and references therein. In a 
different context E3 a certain duality between the two time- 
directions in modeling a stochastic process was introduced in 
order to characterize solutions to moment problems. In this 
new setting the noise-process was general (not necessarily 
white), and the correspondence between the driving inputs to 
the two time-opposite models was shown to be captured by 
suitable dual all-pass dynamics. 

Here, we begin by combining these two sets of ideas 
to develop a general framework where two time-opposite 
stochastic systems model a given stochastic process. We study 
the relationship between these systems and the corresponding 
processes. In particular, we recover as a special case certain 
results of stochastic realization theory im, 0,0, a from 
the 1970’s using a novel procedure. This theory provides a 
normalized and balanced version of the forward-backward 
duality which is essential for our new formulation of the 
two-blter Mayne-Fraser-like formula uniformly applicable to 
intervals with or without observations. 

The paper is structured as follows. In Section |I^ we 
explain how a lifting of state-dynamics into an all-pass system 
allows direct correspondence between sample-paths of driving 
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generating processes, in opposite time-directions, via causal 
and anti-causal mappings, respectively. This is most easily 
understood and explained in discrete-time and hence we begin 
with that. In Section |III] we utilize this mechanism in the 
context of general output processes and, similarly, introduce a 
pair of time-opposite models. These two introductory sections, 
[II| and III deal with stationary models for simplicity and 
are largely based on ll20l . The corresponding generalizations 


to time-varying systems are given in Section IV and in the 
appendix, in continuous and discrete-time, respectively. In 
Section |V] we explain Kalman filtering for problems with 
missing information in the continuous-time setting. In this, 
we first consider the case where increments of the output 
process across intervals of no information are unavailable as 
a simplified preliminary, after which we focus on the central 
problem where the output process is the object of observation. 
Section VI deals with the geometry of information fusion. 
In Section VII we present a generalized balanced two-filter 
formula that applies uniformly over intervals where data is or 
is not available. We summarize the computational steps of this 
approach in Section |VIII| Finally, we highlight the use of the 
two-filter formula with a numerical example given in Section 
IX and provide concluding remarks in Section 


this is identical to the one for discrete-time given above (as 
is well known). In continuous time, stability of the system 
of equations is equivalent to A having only eigenvalues with 
negative real part. 

In either case, discrete-time or continuous-time, it is pos¬ 
sible to define an output equation so that the overall system 
is all-pass. This is done next. 


A. All-pass extension in discrete-time 

Consider the discrete-time Lyapunov equation 

P = APA' + BB'. (6) 

Since A has all eigenvalues inside the unit disc of the complex 
plane and Q holds, has as solution a matrix P which is 
positive definite. The state transformation 

^ = P-"^x, (7) 

and 

F = p-iApi, G = P-^B, (8) 

brings Q into 


II. State dynamics and all-pass extension 


at+i) = Fm + Gw{t). ( 9 ) 


In this paper we consider discrete-time as well as 
continuous-time stochastic linear state-dynamics. We begin by 
explaining basic ideas in a stationary setting. In discrete-time 
systems take the form of a set of difference equations 

x{t-\-1) = Ax{t)Bw{t) (1) 

where t G Z, A G A has all eigenvalues 

in the open unit disc D = {z | |z| < 1}, and w{t),x{t) are 
(centered) stationary vector-valued stochastic processes with 
w{t) normalized white noise; i.e., 

E{r(;(f)ui(s)'} = IpSts, (2) 

where E denotes mathematical expectation. The system of 
equations is assumed to be reachable, i.e.. 


For this new system, the corresponding Lyapunov equation 
X = FXF' -\- GG' has as solution, where denotes the 
(n X n) identity matrix. This fact, namely, that 

= FF' + GG' (10) 


implies that this [F, G] can be embedded as part of an 
orthogonal matrix 


U = 


F G 
H J 


i.e., a matrix such that UU' = U'U = In+p- 
Define the transfer function 


( 11 ) 


\]{z)-.= H{zIn-F)-^G + J (12) 


rank \B, AB, ... A” ^B] = n. 


(3) corresponding to 


In continuous-time, state-dynamics take the form of a 
system of stochastic differential equations 

dx(t) = Ax{t)dt -\- Bdw{t) (4) 

where, here, x{t) is a stationary continuous-time vector-valued 
stochastic process and w{t) is a vector-valued process with 
orthogonal increments with the property 

Fj{dwdw'} = Ipdt, (5) 

where Ip is the p x p identity matrix. Reachability of the 
pair {A, B) is also assumed throughout and the condition for 


^{t-\-1) = F^{t)Gw{t) (13a) 

wit) = H^{t)Jw{t). (13b) 

This is also the transfer function of 

x{t-G 1) = Ax{t)-\-Bw{t) (14a) 

w{t) = B'x{t) -G Jw{t), (14b) 

where B := P~^H', since the two systems are related by a 
similarity transformation. Hence, 

\]{z)=&{zIn-A)-^B + J. (15) 

















Now, using the identity 
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We claim that U(z) is a stable all-pass transfer function (with 
respect to the unit disc), i.e., that U(z) is a transfer function 
of a stable system and that 

u{z)u(z-^y = u{z-^yu{z) = Ip. (i6) 

The latter claim is immediate after we observe that, since 

f/'c/ = 4+p, 


^{t + 1) 


■ m ■ 

. w{t) _ 


. w{t) _ 


and hence. 


C{t) = F'^{t + 1) + H'w{t) 

(17a) 

w{t) = G'^{t + 1) -1- J'w{t) 

(17b) 

or, equivalently. 


x{t) = PA'P~^x{t + 1) + P^II'w{t) 

(18a) 

w{t) = B'P~^x{t + 1) + J'w{t). 

(18b) 

Setting 


x{t) := P~^x{t -\- 1), 

(19) 

([T^ can be written 


x{t — 1) = A'x{t) + Bw{t) 

(20a) 

w{t) = B'x{t) -\- J'w{t) 

(20b) 

with transfer function 


U(z)* = B'{z-^In - A')-^B + J'. 

(21) 


In - FF' = {zin - - F') 

+ izI^-F)F' + F{z-^I^-F'), 

( [T0| ) and GJ' = —FH', obtained from UU' = In+p, this 
yields 

U(z)U(z-i)' = HH' + Jf = /„+p, 

as claimed. 


B. All-pass extension in continuous-time 


Consider the continuous-time Lyapunov equation 

AP + PA' -f BB' = 0. (22) 


Since A has all its eigenvalues in the left half of the complex 
plane and since © holds, ( |22] l has as solution a positive 
definite matrix P. Once again, applying (7][8i, the system in 
becomes 


di{t) = F^{t)dt + Gdw{t). 


(23a) 


We now seek a completion by adding an output equation 

dw(t) = Hy{t)dt -f Jdw{t) (23b) 

so that the transfer function 

U(s) := iJ(s4 - F)-^G + J (24) 

is all-pass (with respect to the imaginary axis), i.e., 

U(s)U(-s)' = U(-s)'U(s) = Ip. (25) 


Either of the above systems inverts the dynamical relation For this new system, the corresponding Lyapunov equation 
w ^ w (in ( [T4l l or (HD). has as solution the identity matrix and hence. 




Fig. 1; Realization ([l4|l in the forward time-direction. 




Fig. 2: Realization (|20li in the backward time-direction. 


An algebraic proof of ([T^ is also quite immediate. In fact, 

U(z)U(2-i)' 

= [Hizin - F)-^G + J] [H{z-^In - F)-^G + J]' 
=H{zIn - F)-^GG'{z-^In - F')-^H' + JJ' 

+ H{zln - F)-^GJ' + JG'{z-^In - F')-^H 


F + F' + GG' = 0. (26) 

Utilizing this relationship we note that 

{sin - F)-^GG'{-sIn - F')-^ 

= {sin - F)-\sIn -F-sIn- F'){-sIn - F')”' 

= {sIn-F)-^ + {-sIn-F')-\ 

and we calculate that 

U(s)U(-s)' 

= {H{sln - F)-^G + J){G'{-sIn - F')-^H' + J') 

= JJ' + H{sln - F)-\GJ' + H') 

{JG' + H){-sIn-F')-^H'. 

For the product to equal the identity, 

JJ' = Ip 
H = -JG'. 
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Thus, we may take 


j = ip 
H = -G', 

and the forward dynamics 

d^{t) = F^{t)dt + Gdw{t) (27a) 

dw{t) = —G'^{t)dt + dw(t). (27b) 

Substituting F = —F' — GG' from ( |26] l into ( |27a| i we obtain 
the reverse-time dynamics 

d^{t) = —F'^{t)dt + Gdw{t) (28a) 

dw{t) = G'^{t)dt + dw{t). (28b) 

Now debning 

x{t) := P~^x{t) (29) 

and using Q and ([^, ( |28l l becomes 

dx{t) = —A'x{t)dt + Bdw{t) (30a) 

dw{t) = B'x{t)dt + dw{t), (30b) 

with transfer function 

Vis)* =Ip + B'isIn + A')-^B, (31) 

where 

B := P-^B. (32) 

Furthermore, the forward dynamics ( |27| ) can be expressed in 
the form 

dx{t) = Axit)dt F Bdwit) (33a) 

dw{t) = B'x{t)dt + dw{t) (33b) 

with transfer function 

U(s) = Jp - &isln - A)-^B. (34) 


III. Time-reversal of stationary linear 

STOCHASTIC SYSTEMS 

The development so far allows us to draw a connection 
between two linear stochastic systems having the same output 
and driven by a pair of arbitrary, but dual, stationary processes 
w{t) and wit), one evolving forward in time and one evolving 
backward in time. When one of these two processes is white 
noise (or, orthogonal increment process, in continuous-time), 
then so is the other. For this special case we recover results 
of ED and ID, 161 in stochastic realization theory. 


A. Time-reversal of discrete-time stochastic systems 


Consider a stochastic linear system 

+ 1) = Axf) + Bwit) 
yit) = Gxit) + Dwif) 


(35a) 

(35b) 


with an m-dimensional output process y, and x, u, A, B are 


defined as in Section II-A All processes are stationary and 
the system can be thought as evolving forward in time from 
the remote past (t = —oo). 

To formalize this, we introduce some notation. Let H be 
the Hilbert space spanned by {wkit); t G Z, k = 1,2,..., n}, 
endowed with the inner product (A,/i) = E{A^}, and let 
H^(r(;) and H^(u>) be the (closed) subspaces spanned by 
{w/c(s); s < t — 1, k = 1,..., m} and {wkis); s > t, k = 
l,...,m}, respectively. Dehne H("(y) and H+( 2 /) accord¬ 
ingly in terms of the output process process y. Then the 
stochastic system ( |35] l evolves forward in time in the sense 
that 

H,-(z)cH,-(u;)TH+(u;), (36) 

where A _L B means that elements of the subspaces A and 
B are mutually orthogonal, and where H^( 0 ) is formed as 
above in terms of 


zit) = 


vit + l)' 


[ yit) J’ 

see ESl Chapter 6] for more details. 

Next we construct a stochastic system 

xit — 1) = A'xif) + Bwit) 
yit) = Gxit) + Dwif), 


(37a) 

(37b) 


which evolves backward in time from the remote future 
{t = oo) in the sense that the processes x, x, w, w relate as in 
the previous section. More specifically, as shown in Section 

IL^ H-(w)) C n-iw) and 11+iw) C H+(u)) for all t, as 


examplihed in Figures and 

In fact, the all-pass extension ( [T4l l of ( |35a| ) yields 

wit) = B'xf) + Jwf) (38) 


It follows from ( |20b| ) that ( [38] l can be inverted to yield 

wit) = B'xit) + J'wit), (39) 

where xit) = P~^xit+ 1), and that we have the reverse-time 
recursion 


xit — 1) = A'xit) + Bwit). 

Then inserting ( [39l l and 

xit) = Pxit — 1) = PA'xit) + PBwit) 
into ( |35b| ), we obtain 

yit) = Gxit) + Dwif), 


(40a) 


(40b) 
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where D := CPB + DJ' and 

C := CPA' + DB'. (41) 

Then, ( |40l l is precisely what we wanted to establish. 

The white noise w is normalized in the sense of Since 
U, given by (ED, is all-pass, w is also a normalized white 
noise process, i.e., 

E{u;(f)w(s)'} = IpSt-s- 
From the reverse-time recursion ( |37a| l 

OO 

x{t) = Bw{k). 

Since, w is a white noise process, E{a:(<)r(}(s)'} = 0 for all 
s < t. Consequently, ( |J7] i is a backward stochastic realization 
in the sense defined above. 


same inner product as above, and let (du) and (du) 
be the (closed) subspaces spanned by the increments of the 
components of U on (—oo,f] and respectively. Define 

{dy) and 'H.f{dy) accordingly in terms of the output 
process y. All processes have stationary increments and the 
stochastic system evolves forward in time in the sense 
that 

H-(dz) C li-{dw) _L H+(fiw), 


M-t ^ yUjiXJ) -L. 3.3.^ 

where {dz) is formed in terms of 

'x{t) 


z(t) = 


.yit). ' 


The all-pass extension of Section II-B yields 


Moreover, the transfer functions 

W{z) = C{zIn-A)-^B + D 

of ( |T5] | and 

W(z) = C{z-^In - A')-^B + D 
of (|T7 ]i satisfy 


dw = dw — B'xdt 
as well as the reverse-time relation 

dx = —A'xdt + Bdw 
dw = B'xdt + dw, 


(46) 

(47) 

(48) 

(49a) 

(49b) 


(42) 


(43) 


where x{t) = P ^x{t). Inserting 


into 


yields 


where 


dy = CPxdt + Ddw 
dy = Cxdt + Ddw, 


W(2) = W(z)U(z). 

(44) 

C = CP + DB'. 

(50) 

In the context of stochastic realization theory, U(z) 

is called 

Thus, the reverse-time system is 


structural function (II13I. 11411. 



dx = —A'xdt -\- Bdw 

(51a) 

w{t) 


yit) 


dy = Cxdt -f Ddw. 

(51b) 

-► 

w 

- ► 


From this, we deduce that the system (|45]l has 

the backward 





property 


Fig. 3; The forward stochastic system (|T5]l. 


U+idz) cH+idw) PUfidw), 

(52) 


where {dz) is formed as above in terms of 


yit) 


W 


w{t) 


z{t) = 


x{t) 


Fig. 4; The backward stochastic system 

B. Time-reversal of continuous-time stochastic systems 
We now turn to the continuous-time case. Let 

dx = Axdt -\- Bdw (45a) 

dy = Cxdt -\- Ddw (45b) 


be a stochastic system with x, w. A, B as in Section |II-B| 
evolving forward in time from the remote past {t = —c»). 
Now let H be the Hilbert space spanned by the increments 
of the components of w on the real line K, endowed with the 


.yit). ' 

We also note that the transfer function 

W{s) = C{sln - A)-^B + D 

of ( |45| l and the transfer function 

W(s) = C{sln + A')-^B + D 
of ( |5 T] i also satisfy 

W(s) = W(s)U(s) 

as in discrete-time. 

Note that the orthogonal-increment process w is normal¬ 
ized in the sense of (|^. Since U(s) is all-pass, 

dw = du — B'xdt (53) 
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also defines a stationary orthogonal-increment process w such 
that 

{dw{t)dw(ty} = Ipdt. 

It remains to show that is a backward stochastic real¬ 
ization, that is, at each time t the past increments of w are 
orthogonal to x{t). But this follows from the fact that 

/ OO 

e-^'^*-^'>Bdw{s) 

and w has orthogonal increments. 

IV. Time reversal oe non-stationary stochastic 

SYSTEMS 

In a similar manner non-stationary stochastic systems 
admit unitary extensions which in turn allows us to construct 
dual time-reversed stochastic models that share the same state 
process. The case of discrete-time dynamics is documented 
in the appendix, whereas the continuous-time counterpart is 
explained next as prelude to smoothing and interpolation that 
will follow. 


Differentiating P{t) ^P{t)P{t) 2 = we obtain 
P(f)-5PP(<)-2 = -R[t) - R{t)', 
and hence the ( |55| l yields 

F{t) + F{t)'+ G{t)G{t)'= Q. (61) 

Using (|6B to eliminate P in ( |57l i. we obtain 

d^ = —F{t)'^{t)dt + G(t)dw, (62) 

where 

dw = dw — G{ty^{t)dt, (63) 

which can also be written 

dw = dw — B{ty x(t)dt, (64) 

where B{t) := P{t)~^B{t). 

Proposition 1: A process w satisfying ( |6B has orthogonal 
increments with the normalized property Q- Moreover, 

F{[w{t) - w{s)]^{ty} = 0 (65) 

for all s < t. 


A. Unitary extension 

The covariance matrix function P(f) := E{a;(f)a;(f)'} of 
the time-varying state representation 

dx = A{t)x{t)dt B{t)dw, a;(0) = Xq (54) 

with Xq a zero-mean stochastic vector with covariance matrix 
Po = E{a:otCo}, satishes the matrix-valued differential equa¬ 
tion 

Pit) = Ait)P{t) + Pit) Ait)' + Bit)Bity (55) 

with P(0) = Pq. Throughout we assume total reachability ifT^ 
Section 15.2], and therefore P(f) > 0 for all f > 0. 

A unitary extension of 0 is somewhat more complicated 
than in the discrete time case. In fact, differentiating 


we obtain 


where 


with 


C(t)=p(f) ixit) 

(56) 

d^ = P(t)^(f)dt -1- Git)dw, 

(57) 

Fit) = Pit)-^ Ait)Pit)^ + Rit), 

(58a) 

Git)=Pit)-iBit) 

(58b) 

m= 

(59) 

d^ = Pit)~^dx + Rit)^it)dt. 

(60) 


Proof: As is well-known, the solution of 0 can be 
written in the form 

= $(f, s)^(s)-f / ^it,T)GiT)dw, (66) 

J S 

where s) is the transition matrix with the property 

— (f,s) = P(f)$(f,s), $(s,s)=/„ (67a) 

ot 

— (i,s) =-$(i,s)p(s), $(f,f)=/„ (67b) 

Let s < t. Then, in view of ( [6^ . a straight-forward calculation 
yields 

wit) — wis) = wit) — w(s) 

— M(f,s)^(s)— f Mit,T)GiT)dw, (68) 

J S 

M(f, s)= f GiTy^iT,s)dT. 

J S 


where 


(69) 


Therefore, 

E{[f()(f) - wis)][wit) - wis))'} = Ipit - s) + A(f, s), 

where 

Ait,s) = Mit,s)Mit,sy + f M(f,T)G(r)G(r)'M(f,r)'dr 

J S 

- j [M(f,T)G(r)-I-G(r)'M(t,r)'] dr. 

J S 

However, A(f, s) is identically zero. To see this, hrst note that 

^it,s) = -Mit,s)Fis)-Gisy. 


In fact. 


(70) 
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Then, in view of a simple calculation shows that 

Since A{t,t) = 0, the assertion follows. Hence the incremental 
covariance is normalized. 


Next, we show that w{t) has orthogonal increments. To 
this end, choose arbitrary times s<t<a<b on the interval 
[0,r], where we choose a and b fixed, and show that 


Q(t, s) := E{[w(&) - w(a)][w(f) - w(s))'} 
is identically zero for all s < t. Using ( |68] l and 
w{b) — w{a) = w{b) — w{s) — M{b, a)$(a, s)^(s) 

— M(b,a) f ^{a,T)G{T)dw — 

J S 

computed analogously, we obtain 


M(&, T)dw 


Q{t,s) = M{b,a) 


^{a, s)M{t, s)' — j ^{a,T)G(T)dT 

J S 

+ f ^{a,T)G{T)G{TyM(t,T)dT 
J S 


Then, again using ( |6T] ), we see that 

dM, , „ 

SO, since Q{t,t) = 0, we see that Q{t,s) is identically zero, 
establishing that w{t) has orthogonal increments. 


Finally, we use the same trick to show ( |65] ). In fact, for 
s < t, ( |66l ) and ( |68| ) yield 

E{[w(i) - w(s))C(f)'} = -M{t, s)$(f, s)' 

G'(r)'<i)(f, r)'dT — f M{t,T)G[T)G{T)')^{t,T)'dr, 
J S 

the partial derivative of which with respect to s is identical 
zero; this is seen by again using ( |M] ). Therefore, since ( |65| l is 
zero for s = f, it is identical zero for all s < f, as claimed. 
This concludes the proof of Proposition ■ 

Consequently, (|57li and (|64| form a forward unitary system 



dx = A{t)x(t)dt + B{t)dw (71a) 

dw = dw — B{t)'x{t)dt, (71b) 

The corresponding backward unitary system is obtained 
through the transformation 

x{t)=P{tYH{t), (72) 

which yields 

dx = P{t)~^dy, + R{t)y,{t)dt. (73) 

This together with ( |62l l and ( |63] l yields 

dx = —A{t)'x{t)dt + B{t)dw (74a) 

dw = B[t)'x{t)dt + dw, (74b) 


B. Time reversal in continuous-time systems 

Next we derive the backward stochastic system corre¬ 
sponding to the non-stationary forward stochastic system 

dx = A{t)x{t)dt + B{t)dw, a:(0) = xq (75a) 

dy = C{t)x(t)dt + D{t)dw, j/(0) = 0 (75b) 

defined on the finite interval [0, T], where xq (with covariance 
Po) and the normalized Wiener process w are uncorrelated. 
To this end, apply the transformation 

x{t) = P{t)~^x(t) (76) 

together with ( |74b| i to ( |75b| ) to obtain 

dy = C{t)x{t) + D{t)dw, 

where 

C{t) = G(t)P{t) + D{t)B{t). (77) 

This together with ( |74a| i yields the the backward system 
corresponding to ( |75| ), namely 

dx = —A{t)'x{t)dt -\- B{t)dw (78a) 

dy = C{t)x{t)dt -f D{t)dw. (78b) 

with end-point condition x{T) = P{T)~^x{T) uncorelated to 
the Wiener process w. 

The backward realization CD was derived in El, but in 
cumbersome way, requiring the proof that w{t) is a normalized 
process with orthogonal increments to be suppressed. What 
is new here is imposing the unitary map between w and w, 
making the analysis much simpler and more natural. 


V. Kalman filtering with missing observations 


We consider the linear stochastic system which does 
not have a purely deterministic component that enables exact 
estimation of components of x from y, an assumption that we 
retain in the rest of the paper. In the engineering literature is 
often the case that the stochastic system CD represented as 

x(t) = A(t)x(t) -h B{t)w{t), x(0) = Xq (79a) 

y(t) = G(t)x(t) -f D{t)w{t) (79b) 


where the formal “derivative” w is white noise, i.e., 
P{w{t)w{sy} = I6(t — s) with S{t — s) being the Dirac 
“function”. Of course x, y and w are to be interpreted 
as generalized stochastic processes. From a mathematically 
rigorous point of view, observing y makes little sense since, 
for any fixed t, y{t) has infinite variance and contains no 
information about the state process x. However, observations 
of y could be interpreted as observations of the increments dy 
of y in a precise meaning to be defined next. On the other 
hand, one can think of (|75|l as a system of type 


dz = M{t)z{t)dt N{t)dw{t), where z{t) 


'x{t) 








and one would like to determine the optimal linear least- 
squares estimate of x{t) given past observed values of y. 

Generally this distinction between observing y or dy is 
not important. However, when there is loss of information 
over an interval {ti,t 2 ), there are two different information 
patterns depending on whether dy or y is observed. The 
difference consists in whether Ay := y{t 2 ) — y{ti) is part 
of the observation record or not. These two cases will be 
dealt with separately in subsections below. In fact, the former, 
which is common in engineering applications, is provided as 
a simplified preliminary, whereas our main interest is in the 
latter. To this end, we first introduce some notation. 

Consider the stochastic system on a finite interval 
[0,r]. As before, let H be the Hilbert space spanned by 
{wk{t) — Wk{s)] s,t S [0,T], fc = 1,2,.. .,m}, endowed 
with the inner product (A,p) = E{Ap}. For any A S H 
and any subspace A, let denote the orthogonal projection 
of A onto A. We denote by 'H.^f^ .i.^j{dy) the (closed) sub¬ 
space generated by the components of the increments of the 
observation process y over the window [ti,t 2 ]- In particular, 
we shall also use the notations ii^{dy) := H[Q f](dt/) and 

Suppose that the output process or its increments are 
available for observation only on some subintervals of [0,T], 

O 

namely 1^, k = 1,2,... ,v. Next we want to define H as the 
proper subspace of ll[Q T]{dy) spanned by the observed data. 
In the case that only the increments dy or, equivalently, the 
“derivative” y is observed, we simply define 

H := Hii {dy) V Hi^ {dy) V • • • V Hx„ {dy). 

In the case that the process y is observed, we need to expand 

O 

H by adding the subspaces spanned by the increments Ay over 
the complementary intervals without observation. In either 
case, we define 

UT :=UnIli{dy) and Hf := H n H+(d 2 /). (80) 

Then Kalman filtering with missing observations amounts to 
determining a recursion for x- where 

o 

ax-{t) ax{t), for all a e K”. (81) 

A. Observing dy only 

When observations are available on the interval [0,ti], the 
Kalman filter on that interval is given by 

dx- = A{t)x-{t)dt K_{t){dy{t) — C{t)x-{t)dt) 

(82a) 

K_ = {Q_C' + BD')R-^ (82b) 

Q_{t) = AQ_+Q_A'-K_RK'_+BB' (82c) 


with R{t) = D{t)D{ty and initial conditions a;_(0) = 0 and 
(5(0) = Pq. Here Q-{t) is the error covariance 

Q-{t) := E{[a::(f) - x_{t)]{x{t) - x_{t)y}, (83) 

which, by the nondeterministic assumption, is positive definite 
for all t. 

Next suppose the observation process becomes unavailable 
over the interval \ti, 12 ) C [0, T], Then the Kalman filter needs 
to be modified accordingly. In fact, for any t G [^ 1 ,^ 2 ), iB 

o 

holds with the space of observations HT := H)~{dy), and 
consequently 

a'x-{t) = a'x{t) = a'^{t,ti)x-{ti). 

This corresponds to setting K- {t) = 0 in ( |8^ on the interval 
\ti, t) so that 

dx- = A{t)x-{t)dt (84a) 

with initial condition X-{ti) given by ( |82a| ). The error covari¬ 
ance (5- is then given by the Lyapunov equation 

Q-{t) = AQ_+Q_A'+ BB' (84b) 

with initial the condition Q-{ti) given by the value produced 
in the previous interval. 

Then suppose observations of dy become available again 
on the interval [t 2 ,t 3 ). Then, for any t G [< 2 ,^ 3 ), we have 

= H[oxi] V 

so the Kalman estimate is generated by ( [82l l but now with 
initial conditions X-{t 2 ) and Q-{t 2 ) being those computed 
in the previous step without observation. In the case there are 
more intervals, one proceeds similarly by alternating between 
filters ( |82| ) and ( [84l i depending on whether increments dy are 
available or not. 

In an identical manner, a cascade of backward Kalman 
filters generates a process x+{t) based on the backward 
stochastic realization and the observation windows [t, T], 
Assuming that there are observations in a final interval ending 
at t = T, on that interval the Kalman estimate 

a'x+{t) = E^* a'x{t), (85) 

O 

with initial observation space := H[j xj, is generated by 
the backward Kalman filter 

dx+ = —A{t)' x+{t)dt 

+ Kj^{t){dy{t) — C{t)x+{t)dt) (86a) 

k+ = -{Q+C' - BD')R-^ (86b) 

Q+ = -A'Q+ - Q+A -f K+R{t)K+{t)' - BB' (86c) 

and initial conditions x+{T) = 0 and (5+(T) = P{T) for x+ 

and the error covariance 

Q+{t) := E{[x(f) - x+{t)][x{t) - x+{t)y}, (87) 
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which like Q- {t) is positive definite for all t. During periods 
of no observations of dy, we then set the gain K+ = 0. This 
update is obtained from the backward time stochastic model 
( |74| l in an identical manner to that of ( |84l l. 

Consequently, both the underlying process as well as the 
filter can run in either time-direction. This duality becomes 
essential in subsequent sections where we will be concerned 
with smoothing and interpolation. 


where 


u{ti) = 





v{ti) 


and Bd and Dd are chosen so that 


Ht2,s)BM(sY\ 

\DdJ^ V M(s)M{sY ) 


ds 


while E{v{ti)v{tiy} = I. 


B. Observing y 


Now consider the case that y, and note merely dy, is 
available for observation on all intervals Ik, k = 1,2,... ,v. 
Under this scenario and with a continuous-time process the 
dynamics of Kalman filtering become hybrid, requiring both 
continuous-time filtering when data is available as well as a 
discrete-time update across intervals where measurements are 
not available. 


Then on the first interval [0, fi] the Kalman estimate ( |82l ) 
will still be valid. However, when t reaches the endpoint t 2 
of the interval of no information and an observation of y is 
obtained again, the subspace of observed data becomes 

H*- = H,- V H(Ay), 


where Ay := y{t 2 ) — y{ti). Computing x{t 2 ) across the win¬ 
dow (^ 1 ,^ 2 ] as a function of x{ti) and the noise components 
we have that 

/ t2 

$(f 2 , s)Bdw{s) 


Ad 




while 


2 /(^ 2 ) = 2/(fi) + / C{t)x{t)dt+ D{t)dw{t). 

J t\ d t\ 

Therefore, 

^y=[ C{t)<^{t,ti)dt)x{ti)+U 2 iti) 

Jtl 


Cd 


where 


U 2 {ti) = [ C{t) f ^{t, s)B{s)dw{s)dt 

Jti Jt 


-I- / D{s)dw{s) 


*2 / /•t2 


s)dtB{s) -f D{s) ) dw{s). 


M(s) 

Thus, we obtain the discrete-time update 


x{t 2 ) = Adx{ti) + Bdv{ti) (88a) 

Ay = Cdx{ti) + Ddv{ti) (88b) 


Hence, across the window of missing data the Kalman state 
estimate x- is now generated by a discrete-time Kalman-filter 
step 

X-{t 2 ) = AdX-{ti) + Kd{Ay - CdX-{ti)) (89a) 

Kd = iAdQ{h)C'd + BdD'd) 

x{CdQ{h)C'd + DdD'd)-^ (89b) 

with initial conditions X-{ti) and Q{ti) given by ( [82l i and 
the error covariance at t 2 by 

<3(^2) = AdQ{ti)A'j^ — Kd{CdQ[ti)C'd 

+ DdD'd)K'd + BdB'd. (89c) 

In the next interval [^ 21 ^ 3 ]^ where observations of y are 
available, the new Kalman estimate iB with 

V H(Aj/) V 

is again generated by the continuous-time Kalman filter ( | 8 ^ 
starting from x-{t 2 ) and Q{t 2 ) given by ( |89] l. 

Again given an observation pattern, where intermittently 
y becomes unavailable for observation, the Kalman estimate 
can be generated in precisely this manner by a cascade 
of continuous and discrete-time Kalman filters. 

Remark 2: The observation pattern of a continuous-time 
stochastic model, where y becomes unavailable over particular 
time-windows, is closely related to hybrid stochastic models 
where continuous-time diffusion is punctuated by discrete-time 
transitions. Indeed, unless interpolation of the statistics within 
windows of unavailable data is the goal, the end points of such 
intervals can be identified and the same hybrid model utilized 
to capture the dynamics. 

Remark 3: A common engineering scenario is the case 
where the signal is lost while the observation noise is still 
present. This amounts to having (7 = 0 over the corresponding 
window, and the Kalman estimates are obtained by merely 
running the filters ( [ 8 ^ and ( [ 86 | ) in the two time directions with 
the modified condition on C. This situation does not cover 
the information patterns discussed above since, whenever 
BD' 0, the Kalman gains do not vanish and information 
about the state process is available even when C is zero. 
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C. Smoothing 

Given these intermittent forward and backward Kalman 
estimates, we shall derive a formula for the smoothing estimate 

ax{t) := ax{t), a € M”, (90) 

valid for both the cases discussed above, where 

H:=Hr VH^CH[o,r](d2/) (91) 


proving condition (i). Condition (ii) follows from a symmetric 
argument. To prove (iii) we use condition ( |9^ . To this end, 
first note that, by the usual projection formula, 

ax-{t) = E{ax-{t)x+{t)}P+{t)~^x+{t) 

= a' E{a;_(t)a;+(f)'}a;+(f), 

where x+{t) := P+{t)~^x+{t) is the dual basis in X+(f) such 
that E{a;+(t)a;+(f)'} = I. Moreover, 


is the complete subspace of observations. This is discussed 
next. 


VI. Geometry oe eusion 


Consider the system and let X(t) be the (finite¬ 

dimensional) subspace in H spanned by the components of the 
stochastic state vector x{t). Then it can be shown ifTSl Chapter 
7] that H[o.t](dy) T | X*, where A T B | X 

denotes the conditional orthogonality 

(a - E^ a, /3 - E^ /3) = 0 for all a G A, ^ G B. (92) 


Next, let X__ (f) and X+ (t) be the subspaces spanned by the 
components of the (intermittent) Kalman estimates X-{t) and 

O 

x+(t), respectively. Then since X_(f) C Hj" C H[o, 

O 

and X+(f) C Ht C (dy), we have 

X_(f)TX+(f) |X(f), 


which is equivalent to 

E^+(‘) a'x-(t) = E^+(*) E^(‘) a'x-{t), a G M” 

M Proposition 2.4.2]. Therefore the diagram 



|x_N< ^E^+ lx 

X 


(93a) 


(93b) 


commutes, where the argument t has been suppressed. 


E^^‘^ a'x-{t) = E{a'x-{t)x{ty}P{t) ^x{t) 

= a'E{x-{t)x{ty}x{t) = a'P-{t)x{t), 

where we have used condition (i) and ( |76l l. Next, set b := P_o 
and form 

E^+(‘) b'x{t) = E{b'xit)x+it)}P+{t)-^x+it) 

= b'E{x{t)x+{t)}x+{t) 

= b'P+{t)x+{t), 

by condition (ii), and consequently 

gx+(t) E^‘^*'>a'x-{t) = a'P-{t)P+{t)x+{t). (95) 

Then condition (iii) follows from ( |9^ , ([^ and ■ 

Remark 5: The proof of condition (iii) in Lemma could 
be simplified if x+ were a regular backward Kalman estimate 
without intermittent loss of information. In this case, x+ = 
P^^x^ would be generated by a forward stochastic realization 
belonging to the same class as and E{x+{t)x-{ty} = 
P+{t) E{x+{t)x-{t)} = P+{t) E{x_(f)x_(f)}. 

Lemma 6: For each t G [0,r], the smoothing estimate 
x{t), defined by ®, is given by 

a'x(t) = a'x{t), a G K", (96) 

where is the subspace 

H° =X_(f)VX+(f). (97) 


Lemma 4: Let x{t), x{t), X-{t) and x+{t) be defined as 
above. Then, for each t G [0,T], 

(i) E{x(f)a;_(f)'} = P_(t) 

(ii) E{x{t)x+{ty} = P+p) 

(iii) E{x+{t)x-{ty} = P+{t)P-{t), 

where P-{t) := E{x_(f)x_(f)'} is the state covariance of the 
Kalman estimate X-{t) and P+{t) := E{x+(f)x+(f)'} is the 
covariance of the backward Kalman estimate x+(f). 

Proof: By the definition of the Kalman filter, ([8T]| holds, 
and consequently the components of the estimation error 
x{f) — X-{t) are orthogonal to Hj" and hence to the com¬ 
ponents of x-{f). Therefore, 

E{x{t)x-(ty} = E{x-{t)x-{ty} = p-{t), 


Proof: Following iflTl . 13, ifTSll . define N {t) := Hj 0 

X_(f) and N+(f) := 0 X+(f). Then 

H = N-(f)0H°0N+(t). 

O 

Now, a'{xif) — X-{t)) is orthogonal to Hj" and hence to 
N“(f). Also a'x-(t) 0 N“(f). Hence a'x(t) 0 N“(f) as 
well. In the same way we see that a'x{t) 0 N+(f:). Therefore 
( |9^ follows. ■ 

Consequently, the information from the two Kalman filters 
can be fused into the smoothing estimate 

x{t) = L-{t)x-{t) + L+{t)x+{t) (98) 

for some matrix functions L_. and L+. 
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VII. Universal two-filter formula 

To obtain a robust and particularly simple smoothing for¬ 
mula that works also with an intermittent observation pattern, 
we assume that the stochastic system ( [75] l has already been 
transformed via ( [58] l so that, for all t G [0,T], 

x(t) = x{t) (99) 

and therefore 

p{t) = E{xit)xity} = I = P{t). ( 100 ) 

Then the error covariances in the filtering formulas of Sec¬ 
tion El are 

Q_=I-P_ and Q+=I-P+. (101) 

Consequently, x{t), x{t), P-(t) and P+{t) are all bounded in 
norm by one for all t G [0, T]. 

Theorem 7: Suppose that ( |99l l holds. For every t G [0,T], 
we have the formula 

x(t) = Q(t) (Q_(t)~^x_(t) + Q+(t)~^x+(t)) (102) 

for the smoothing estimate ( |90| ), where the estimation error 
Q(t) := E{(x(t) - x(t)) (x(t) - xp))'} (103) 

is given by 

Q{t)-^=Q.{t)-^ + Q+{t)-^-I, (104) 


Then P02| l follows from ( |98l l and ( |107| l. To prove ( |104| ) 
eliminate Z_|_ in ( |106| ) to obtain 

L_(/-P_P+)=Q+, 

which together with ( |107| i yields 

Q-^ = QZ\I - P-P+)Qy\ 

However, 

I - P-P+ = Q+ + Q-- Q-Q+, 

and hence ( |104| i follows. ■ 

In the special case with no loss of observation this is a 
normalized version of the Mayne-Frazer two-filter formula HI, 
0, which however in ill, 0 was formulated in terms of 
X- and x+ rather than a;+, where x+ is the state process of 
the forward stochastic system of the backward Kalman hlter. 
(For the corresponding formula in terms of X- and x+, see 
0, M; also cf. Ga, where an independent derivation was 
given.) With a single interval of loss of observation the formula 
( |102| i reduces to a version of the interpolation formulas in 
0. The remarkable fact, discovered here, is that the same 
formula ( |102| i holds for any intermittent observations structure 
and by a cascade of continuous and discrete-time forward and 
backward Kalman filters, as needed depending on the assumed 
information pattern. 

VIII. Recap of computational steps 


and where x_, x+, Q- and Q+ are given by ( [82| ) and ( |86l ) 
with boundary conditions x_(0) = x+(T) — 0 and Q_(0) = 
Q+(T) = I. 

Proof: Clearly the matrix functions L_ and P+ in ( |98| ) 
can be determined from the orthogonality relations 


o 

II 

1 

il 

1 

(105a) 

E{[x(f) — x{t)]x+{ty} = 0. 

(105b) 


By Lemma 0 ( |105| ) yields 

P_ - L_P_ - L+P+P- = 0 
P+ - P_P_P+ - L+P+ = 0, 

which, in view of the fact that P_ and P+ are positive dehnite, 
yields 

L_ -f L+P+ = I (106a) 

L_P_+L+=I (106b) 

Again by orthogonality and Lemma 

Q = E{{x - x)x') = I - L_P_ -L+P+, 
which, in view of ( |106| l and the relations ( |101| ), yields 

L_=QQZ^ and L+= QQf}. (107) 


Given a system ( |75] l with state covariance ( [55| l, make the 
normalizing substitution 

A{f) G- P{t)-^A(t)P{t)i + R{t) 

B{t)^ P(t)-"^B(t) (108) 

C(t) G- C{t)P{t)^ 


with R(t) = Next, we compute the 

intermittent forward and backward Kalman hlter estimates x_ 
and x+, respectively, along the lines of Section |Vj where, 
due to the normalization, (5_(0) = Q+{T) = In- Then the 
smoothing estimate is given by 


x{t) = Q{t) {Q-{t) '^x-{t) + Q+{t) ^x+{t)) , 


where 

Q(f) = (Q_(f)-i + Q+(t)-i-/)■'. 


IX. An example 

We now illustrate the results of the paper on a specihc 
numerical example. We consider the continuous-time diffusion 
process 

dxi{t) = X 2 {t)dt 

dx 2 {t) = —0.3xi{t)dt — 0.7x2{t)dt + dw(t) 

dy{t) = xi{t)dt + dv(t) 
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where w and v are thought to be independent standard Wiener 
processes. Here, xi is thought of as position and X 2 as velocity 
of a particle that is steered by stochastic excitation in dw, in the 
presence of a restoring force O.Sxi and frictional force 0.7x2. 
Then dy/dt represents measurement of the position and dv/dt 
represents measurement noise (white). 


Output process and states 



Fig. 5: Sample paths of output process, increment, and state 
processes 


of intervals, data are not made available for state estimation; 
these intervals where data are not to be used are marked by 
a thick blue baseline in the figures. In Figure we display 
sample paths of the output process y, increments dy, and state- 
processes Xi and X 2 - 


KF backward estimate - missing data in blue intervals 




Time [sec] 

Fig. 7: Kalman estimates in the backward time direction 


KF forward estimate - missing data in blue intervals 
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Fig. 6; Kalman estimates in the forward time direction 
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Fig. 8; Interpolation/smoothed estimates by fusion of Kalman 
forward and backward estimates 


Numerical simulation over [0,T] with T = 45 (units of 
time) produces a time-function y{t) which is sampled with 
integer multiples of At = 0.01 (units). The interval [0,T] is 
partitioned into 

[0,T] = uti[U-i,U] 

where to = 0 and ti — ti_i = i (units). Measurements of y 
are made available for purposes of state estimation over the 
intervals [ti-i,ti] for i = 1,3, 5,9. Over the complement set 


The process increments dy over [ti-i,ti] for i = 1,3, 5, 9 
as well as the increments Ay across the [ti-i,ti] for i = 
2,4, 6,8 are used in the two-filter formula for the purpose 
of smoothing. The Kalman estimates for the states in the 
forward and backwards in time directions, x_(t) and x+{t) 
are shown in Figures and respectively. The fusion of the 
two using ( |102[ ) is shown in Figure]^ It is worth observing the 
nature and fidelity of the estimates. In the forward direction, 
across intervals where data is not available, x_ becomes 
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increasing more unreliable whereas the opposite is true for 
x+, as expected. The smoothing estimate is generally an 
improvement to those of the two Kalman filters as seen in 
Figure In particular, it is worth noting X 2 (in subplot 2), 
where, over windows of available observations, estimates have 
considerably less variance in the middle of the interval where 
the weights and Q{t)Q^(t)~^) in ( |102| l are 

equalized, whereas sample paths become increasing rugged 
at the two ends where one of the two Kalman estimates has 
significantly higher variance, and the corresponding mixing 
coefficient becomes relatively smaller. 

X. Concluding remarks 

Historically the problem of interpolation has been consid¬ 
ered from the beginning of the study of stochastic processes 
EH . E^ . Early accounts and treatments were cumbersome 
and non-explicit as the problem was considered difficult Q, 
iSl . ||9ll . ifTOl . In a manner that echoes the development of 
Kalman filtering, the problem became transparent and com¬ 
putable for ouput processes of linear stochastic systems 13, 

0, GSl- 

This paper builds on developments in stochastic realization 
theory HD, M and presents a unified and generalized two- 
filter formula for smoothing and interpolation in continuous 
time for the case of intermittent availability of data over 
an operating window. The analysis considers two alternative 
information patterns where increments of the output process or 
the output process itself is recorded when information becomes 
available. The second information pattern appears most natural 
to us in this continuous-time setting, and this is our main 
problem. Nevertheless, in either case, two Kalman filters run in 
opposite time-directions, designed on the basis of a forward 
and a backward model for the process, respectively. Fusion 
of the respective estimates is effected via linear mixing in 
a manner similar to the Mayne-Fraser formula and applies to 
both smoothing and interpolation intermixed. In earlier works, 
smoothing and interpolation have been considered separate 
problems ifTSl Chapter 15]. The balancing normalization also 
simplifies the mixing formula and makes it completely time 
symmetric. 

The theory relies on time-reversal of stochastic models. 
We provide a new derivation of such a reversal which has the 
convenient property of being balanced. It is based on lossless 
imbedding of linear systems and effects the time reversal 
through a unitary transformation. Interestingly, time symmetry 
in statistical and physical laws have occupied some of the most 
prominent minds in science and mathematics. In particular, 
closer to our immediate interests, dual time-reversed models 
have been employed to model, in different time-directions. 
Brownian or Schrodinger bridges ll25l . Il2^ . a subject which is 
related to reciprocal processes ll27l . 1281, l29l . OOl . A natural 


extension of the present work in fact is in the direction of 
general reciprocal dynamics 1281 . 1^ and the question of 
whether similar two-filter formula are possible. 


Appendix; Time reversal of non-stationary 

DISCRETE-TIME SYSTEMS 


Next, instead of Q’ consider the non-stationary state 
dynamics 


x{t + 1) = A{t)x{t) + B{t)w{t), t( 0) = To, (109) 

on a finite time-window [0,T], where, for simplicity we 
now assume that the covariance matrix Pq := P{0) of 
the zero-mean stochastic vector xq is positive definite, i.e., 
Pq = E{toTo} > 0. Then the state covariance matrix 
P{t) := E{a;(f)T(f)'} will satisfy the Fyapunov difference 


equation 

p(t +1) = A(t)p{t)A{ty + B{t)B{ty. (110) 
The state transformation 

C(f) =P(f)-5T(f) (111) 

brings the system 1109| l into the form 

^^t + l) = F{t)at) + G{t)wit), (112) 

where now E{^(f)^(f)'} = for all t and 

F{t) = P{t + l)-^A{t)P{t)^, (113a) 

G(t) = P{t + l)-^B. (113b) 

The Fyapunov difference equation then reduces to 

In = F{t)F{ty+ G{t)G{ty (114) 


allowing us to embed [F, G] as part of a time-varying orthog¬ 
onal matrix 


U{t) 


F{t) G{t) - 
H{t) J{t) _ ■ 


This amounts to extending (|1 \2\ to 


(115) 


-I-1) = F{t)i{t) + G{t)w{t) (116a) 

w{t) = H(t)y,{t) + J{t)w{t), (116b) 


or, in the equivalent form 

'at+ 1)' 

. tiiit) ^ 


U{t) 


m' 

w(t)_ 


(117) 


Hence, since E{^(f),f(f)'} = /„ and E{r(;(f)r(;(f)'} = Jp, and 
assuming that E {^(f)'u;(f)'} = 0, 


E. 


'at + 1) 

{ [ w{t) . 

which yields 


at +1) 


U{t)U{ty = In+p, (118) 


E{e(f + l)u;(f)'} = 0, (119a) 

F{w{t)u{ty} = Ip. (119b) 
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Moreover, from ( |1 16| l we have 
u(t + k) = H{t + k)^(t + fc, 

for A: > 0, where 

^^^iF{s-l)F{s-2)---F{t) for s>t 
for s = t. 

Therefore, since F(t)F[{tY + G(t)J{ty = 0 by the unitarity 
of U{t), 

E{rt(f + k)u{ty} 

= H{t + k)^t + k,t+ l)[F{t)H{ty + G{t)J{ty] = 0 . 

Consequently, it is a white noise process. Finally, premulti¬ 
plying ( |1 17| l by U{ty, we then obtain 

= F{ty^{t -f 1) -f H{tyw{t) (120a) 

w{t) = G{ty^(t -f 1) -f J{tyiju(t), (120b) 

which, in view of ( |1 19| l, is a backward stochastic system. 
Using the transformation (|lll|i, (|1 16|l yields the forward 


representation 

x{t + 1) = A{t)x{t) + B{t)w{t) (121a) 

w(t) = B{tyx{t) F J{t)w{t), (121b) 

where B{t) := P{t)~iH{ty. Likewise ( |120| i and 

xit) = P{t + l)-^x{t + l), (122) 

yields the backward representation 

x{t — 1) = A{tyx{t) + B(t)w{t) (123a) 

w(t) = B{tyx(t) + J(tyw{t). (123b) 


Remark 8: When considered on the doubly infinite time 
axis, equation ra defines an isometry. Indeed, assuming that 
the input is squarely summable, the fact that U(t) is unitary 
for all t directly implies that 

N N 

+ ||^(f + l)f = ^|k(f)f. 

— oo —oo 

Then, ^(f) —>■ 0 as f —>■ oo, provided $(f, s) —>■ 0 as s —>■ —oo. 
It follows that 

OO OO 

E = E 

t— — oo 

We are now in a position to derive a backward version of 
a non-stationary stochastic system 

x{t + 1) = A{t)x{t) + B{t)w{t), x{0) = Xq (124a) 

y{t) = G{t)x{t) + D{t)w{t) (124b) 


where xq and the normalized white-noise process w are 
uncorrelated and E{a;oa:o} = Pq- In fact, inserting the trans¬ 
formations \\22) and ( |123a| i into ( |124b| ) yields 

y{t) = Cx{t) + Dw{t), 

where 

C = G{t)P{t)A{ty + D{t)B{ty (125) 

D = G{t)P{t)B{t) + D(t)J{ty (126) 

From that we have the backward system 

x(t — 1) = A{tyx(t) + B{t)w{t) (127a) 

y{t) = C{t)x{t) -I- D{t)w{t) (127b) 

with the boundary condition x{T — 1) = P{T)~^x{T) being 
uncorrelated to the white-noise process w. 
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